Multi-view contrastive learning


Self-Weighted Contrastive Learning among Multiple Views for Mitigating Representation Degeneration

Neural Information Processing Systems

Contrastive learning (CL) explicitly increases the feature-representation similarity between semantically relevant samples, making it adept at capturing high-level semantics while discarding irrelevant information.
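The core mechanism the abstract describes, pulling together representations of semantically relevant pairs relative to all other pairs, is commonly instantiated as an InfoNCE-style loss. A minimal NumPy sketch (an illustrative assumption, not the exact loss used by this paper):

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """InfoNCE-style contrastive loss.

    z1, z2: (n, d) L2-normalized embeddings of two views of the same n samples;
    row i of z1 and row i of z2 form the positive pair, all other rows are negatives.
    """
    n = z1.shape[0]
    logits = (z1 @ z2.T) / temperature                 # (n, n) pairwise similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(n), np.arange(n)].mean()  # positives on the diagonal
```

Minimizing this loss enlarges the similarity of each positive pair relative to the negatives in the same batch, which is the effect the sentence above describes.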


Multi-View Contrastive Learning for Robust Domain Adaptation in Medical Time Series Analysis

Oh, YongKyung, Bui, Alex

arXiv.org Artificial Intelligence

Adapting machine learning models to medical time series across different domains remains a challenge due to complex temporal dependencies and dynamic distribution shifts. Current approaches often focus on isolated feature representations, limiting their ability to fully capture the intricate temporal dynamics necessary for robust domain adaptation. In this work, we propose a novel framework leveraging multi-view contrastive learning to integrate temporal patterns, derivative-based dynamics, and frequency-domain features. Our method employs independent encoders and a hierarchical fusion mechanism to learn feature-invariant representations that are transferable across domains while preserving temporal coherence. Extensive experiments on diverse medical datasets, including electroencephalogram (EEG), electrocardiogram (ECG), and electromyography (EMG), demonstrate that our approach significantly outperforms state-of-the-art methods in transfer learning tasks. By advancing the robustness and generalizability of machine learning models, our framework offers a practical pathway for deploying reliable AI systems in diverse healthcare settings.

Data and Code Availability

This study uses publicly available datasets in medical and healthcare domains, including SleepEEG (Kemp et al., 2000) and ECG (Clifford et al., 2017) for pre-training, and Epilepsy (Andrzejak et al., 2001), FD (Lessmeier et al., 2016), Gesture (Liu et al., 2009), and EMG (Goldberger et al., 2000) for fine-tuning. The datasets used in this study are publicly accessible via their respective repositories, with detailed documentation included in the supplementary material.
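The three views the abstract names (temporal patterns, derivative-based dynamics, frequency-domain features) can be constructed from a raw signal before each view is passed to its own encoder. A minimal sketch of such view construction, assuming a 1-D signal; the paper's actual encoders and hierarchical fusion are not reproduced here:

```python
import numpy as np

def make_views(x):
    """Build three views of a 1-D signal x, mirroring the view types named above.

    Returns (temporal, derivative, frequency); each would feed an independent
    encoder in a multi-view contrastive setup. Illustrative preprocessing only.
    """
    temporal = x
    derivative = np.diff(x, prepend=x[0])  # first-order dynamics, same length as x
    frequency = np.abs(np.fft.rfft(x))     # magnitude spectrum of the signal
    return temporal, derivative, frequency
```

In a full pipeline, each view's encoder output would then be aligned across views with a contrastive objective and fused for the downstream task.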


Exploring Self-Supervised Multi-view Contrastive Learning for Speech Emotion Recognition with Limited Annotations

Khaertdinov, Bulat, Jeuris, Pedro, Sousa, Annanda, Hortal, Enrique

arXiv.org Artificial Intelligence

Recent advancements in Deep and Self-Supervised Learning (SSL) have led to substantial improvements in Speech Emotion Recognition (SER) performance, reaching unprecedented levels. However, obtaining sufficient amounts of accurately labeled data for training or fine-tuning the models remains a costly and challenging task. In this paper, we propose a multi-view SSL pre-training technique that can be applied to various representations of speech, including those generated by large speech models, to improve SER performance in scenarios where annotations are limited. Our experiments, based on wav2vec 2.0, spectral, and paralinguistic features, demonstrate that the proposed framework boosts SER performance by up to 10% in Unweighted Average Recall in settings with extremely sparse data annotations.
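The abstract contrasts learned representations (wav2vec 2.0) with hand-crafted spectral and paralinguistic features as separate views of the same utterance. A small sketch of two such hand-crafted views, frame-level magnitude spectra and a zero-crossing-rate statistic; these are illustrative stand-ins, not the paper's exact feature set:

```python
import numpy as np

def speech_views(x, frame=256, hop=128):
    """Two simple views of a waveform x for multi-view pre-training.

    Returns (spectral, zcr): per-frame magnitude spectra and per-frame
    zero-crossing rate (a basic paralinguistic-style descriptor).
    """
    frames = np.lib.stride_tricks.sliding_window_view(x, frame)[::hop]
    spectral = np.abs(np.fft.rfft(frames, axis=1))               # (n_frames, frame//2 + 1)
    zcr = (np.diff(np.sign(frames), axis=1) != 0).mean(axis=1)   # fraction of sign changes
    return spectral, zcr
```

In a multi-view SSL setup, each such view (plus a learned view such as wav2vec 2.0 embeddings) would be encoded separately and aligned with a contrastive objective during pre-training.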


Multi-view Contrastive Learning for Entity Typing over Knowledge Graphs

Hu, Zhiwei, Gutiérrez-Basulto, Víctor, Xiang, Zhiliang, Li, Ru, Pan, Jeff Z.

arXiv.org Artificial Intelligence

Knowledge graph entity typing (KGET) aims at inferring plausible types of entities in knowledge graphs. Existing approaches to KGET focus on how to better encode the knowledge provided by the neighbors and types of an entity into its representation. However, they ignore the semantic knowledge provided by the way in which types can be clustered together. In this paper, we propose a novel method called Multi-view Contrastive Learning for knowledge graph Entity Typing (MCLET), which effectively encodes the coarse-grained knowledge provided by clusters into entity and type embeddings. MCLET is composed of three modules: i) a Multi-view Generation and Encoder module, which encodes structured information from entity-type, entity-cluster, and cluster-type views; ii) a Cross-view Contrastive Learning module, which encourages different views to collaboratively improve view-specific representations of entities and types; iii) an Entity Typing Prediction module, which integrates multi-head attention and a Mixture-of-Experts strategy to infer missing entity types. Extensive experiments show the strong performance of MCLET compared to the state-of-the-art.
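The prediction module above names a Mixture-of-Experts strategy for inferring missing types. A generic MoE combination can be sketched as follows; the shapes and gating form are hypothetical, since the abstract names the strategy but not its exact design:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def moe_type_logits(h, expert_weights, gate_weight):
    """Mixture-of-Experts type prediction sketch.

    h:              (d,) entity representation
    expert_weights: list of k (d, t) matrices, one per expert, mapping h to type logits
    gate_weight:    (d, k) matrix producing softmax mixing weights over experts
    Returns (t,) mixed type logits.
    """
    gates = softmax(h @ gate_weight)                        # (k,) mixing weights
    expert_out = np.stack([h @ W for W in expert_weights])  # (k, t) per-expert logits
    return gates @ expert_out                               # (t,) gated combination
```

Each expert can specialize in a different view or granularity of evidence, with the gate learning which expert to trust for a given entity.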